A Gaussian process latent variable model formulation of canonical correlation analysis
Authors
Abstract
We investigate a nonparametric model for visualizing the relationship between two datasets. Our model builds on the Gaussian Process Latent Variable Model (GPLVM) [1], [2], a probabilistically defined latent variable model that takes the alternative approach of marginalizing the mapping parameters and optimizing the latent variables. We optimize a latent variable set for each dataset so that the correlations between the datasets are preserved, yielding a GPLVM formulation of canonical correlation analysis that can be nonlinearised by choice of covariance function.
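The linear baseline that this model generalizes is classical canonical correlation analysis. A minimal sketch of linear CCA via the SVD of the whitened cross-covariance (function name, regularizer, and dimensions are illustrative choices, not from the paper):

```python
import numpy as np

def linear_cca(X, Y, k=2, reg=1e-6):
    """Classical linear CCA. X: (n, dx), Y: (n, dy).
    Returns projection matrices Wx, Wy and the top-k canonical
    correlations. `reg` is a small ridge term for stability."""
    X = X - X.mean(0)
    Y = Y - Y.mean(0)
    n = X.shape[0]
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n

    def inv_sqrt(C):
        # Inverse matrix square root via eigendecomposition
        w, V = np.linalg.eigh(C)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    Wxx, Wyy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    # Singular values of the whitened cross-covariance are the
    # canonical correlations
    U, s, Vt = np.linalg.svd(Wxx @ Cxy @ Wyy)
    return Wxx @ U[:, :k], Wyy @ Vt[:k].T, s[:k]
```

Replacing the implicit linear mappings here with GP priors over nonlinear mappings is, roughly, the move the abstract describes.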
Similar resources
Shared Gaussian Process Latent Variables Models
A fundamental task in machine learning is modeling the relationship between different observation spaces. Dimensionality reduction is the task of reducing the number of dimensions in a parameterization of a dataset. In this thesis we are interested in the crossroads between these two tasks: shared dimensionality reduction. Shared dimensionality reduction aims to represent multiple observation spa...
A Probabilistic Interpretation of Canonical Correlation Analysis
We give a probabilistic interpretation of canonical correlation analysis (CCA) as a latent variable model for two Gaussian random vectors. Our interpretation is similar to the probabilistic interpretation of principal component analysis (Tipping and Bishop, 1999; Roweis, 1998). In addition, we can interpret Fisher linear discriminant analysis (LDA) as CCA between appropriately defined vectors.
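The latent variable model referred to here generates both views from one shared Gaussian latent variable, with independent noise in each view. A minimal generative sketch (loadings, noise levels, and dimensions are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
# Shared one-dimensional Gaussian latent variable
z = rng.standard_normal(n)
# Each view is a linear map of z plus independent Gaussian noise,
# as in the latent variable formulation of CCA
x = 1.0 * z + 0.2 * rng.standard_normal(n)
y = 0.5 * z + 0.2 * rng.standard_normal(n)
# x and y are dependent only through z, so they are strongly correlated
r = np.corrcoef(x, y)[0, 1]
```

In this formulation the maximum-likelihood loadings recover the classical CCA directions up to rotation, which is the content of the probabilistic interpretation.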
FOLS: Factorized Orthogonal Latent Spaces
Many machine learning problems inherently involve multiple views. Kernel combination approaches to multiview learning [1] are particularly effective when the views are independent. In contrast, other methods take advantage of the dependencies in the data. The best-known example is Canonical Correlation Analysis (CCA), which learns latent representations of the views whose correlation is maximal...
Gaussian Process Latent Variable Models for Visualisation of High Dimensional Data
In this paper we introduce a new underlying probabilistic model for principal component analysis (PCA). Our formulation interprets PCA as a particular Gaussian process prior on a mapping from a latent space to the observed data-space. We show that if the prior’s covariance function constrains the mappings to be linear, the model is equivalent to PCA; we then extend the model by considering less ...
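The GPLVM training described here maximizes the GP marginal likelihood with respect to the latent points rather than the mapping. A minimal sketch with an RBF covariance and a generic optimizer (function names, fixed noise variance, and dimensions are illustrative assumptions, not the paper's implementation):

```python
import numpy as np
from scipy.optimize import minimize

def gplvm_nll(x_flat, Y, q, sigma2=0.1):
    """Negative GP log marginal likelihood (up to constants) of data Y
    given latent points X, with an RBF kernel and fixed noise sigma2."""
    n, d = Y.shape
    X = x_flat.reshape(n, q)
    sq = ((X[:, None] - X[None]) ** 2).sum(-1)
    K = np.exp(-0.5 * sq) + sigma2 * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, Y))  # K^{-1} Y
    # d * (1/2) log|K| + (1/2) tr(K^{-1} Y Y^T)
    return d * np.log(np.diag(L)).sum() + 0.5 * (Y * alpha).sum()

# Usage: optimize the latent points themselves (toy data)
n, d, q = 25, 5, 2
Y = np.random.default_rng(0).standard_normal((n, d))
x0 = 0.1 * np.random.default_rng(1).standard_normal(n * q)
res = minimize(gplvm_nll, x0, args=(Y, q), method="L-BFGS-B",
               options={"maxiter": 20})
```

With a linear kernel in place of the RBF, the optimal latent points span the principal subspace, which is the PCA equivalence the abstract states.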
A Variational Bayesian Formulation for GTM: Theoretical Foundations
Generative Topographic Mapping (GTM) is a non-linear latent variable model of the manifold learning family that provides simultaneous visualization and clustering of high-dimensional data. It was originally formulated as a constrained mixture of Gaussian distributions, for which the adaptive parameters were determined by Maximum Likelihood (ML), using the Expectation-Maximization (EM) algorithm...